Similar resources
Morphemes and Lexemes versus “Morphemes or Lexemes?”
More than a century after the first linguistic definition of the notion of morpheme by Baudouin de Courtenay (1895) and Sweet (1876), an everlasting debate, which I shall refer to here as the “Morpheme or Lexeme” (M or L) debate, on the nature of linguistic bricks is still going on. Since a by-product of this debate is terminological confusion in the use of the four notions of morpheme, lexeme, ...
Unsupervised Discovery of Morphemes
We present two methods for unsupervised segmentation of words into morpheme-like units. The model utilized is especially suited for languages with a rich morphology, such as Finnish. The first method is based on the Minimum Description Length (MDL) principle and works online. In the second method, Maximum Likelihood (ML) optimization is used. The quality of the segmentations is measured using an...
Extracting Morphemes without #
In unsupervised morphology learning algorithms (Goldsmith, 2000; Cavar et al., 2004), it is assumed that words are already segmented by white space. However, it is somewhat obvious that white spaces between words are not pronounced in spoken English. While word boundaries are indicated by white spaces in written English, speakers of English do not pronounce anything particular for space or they ...
Unsupervised Morphemes Segmentation
In this work, we describe the algorithm adopted to split words into the smallest possible meaningful units, or morphemes. The algorithm is unsupervised and not dependent on any language. The model is developed using the English language; however, linguistic rules specific to English are not implemented. The algorithm focuses on the identification of the smallest units of words based on thei...
Unsupervised Discovery of Persian Morphemes
This paper reports current results of research on unsupervised Persian morpheme discovery. We present a method for discovering the morphemes of the Persian language through automatic analysis of corpora. We utilized a Minimum Description Length (MDL) based algorithm with some improvements and applied it to a Persian corpus. Our improvements include enhancing the cost function usin...
Journal
Journal title: Journal of Purdue Undergraduate Research
Year: 2019
ISSN: 2158-4044,2158-4052
DOI: 10.5703/1288284316965